task variant
When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1
McCoy, R. Thomas, Yao, Shunyu, Friedman, Dan, Hardy, Mathew D., Griffiths, Thomas L.
In "Embers of Autoregression" (McCoy et al., 2023), we showed that several large language models (LLMs) have some important limitations that are attributable to their origins in next-word prediction. Here we investigate whether these issues persist with o1, a new system from OpenAI that differs from previous LLMs in that it is optimized for reasoning. We find that o1 substantially outperforms previous LLMs in many cases, with particularly large improvements on rare variants of common tasks (e.g., forming acronyms from the second letter of each word in a list, rather than the first letter). Despite these quantitative improvements, however, o1 still displays the same qualitative trends that we observed in previous systems. Specifically, o1 -- like previous LLMs -- is sensitive to the probability of examples and tasks, performing better and requiring fewer "thinking tokens" in high-probability settings than in low-probability ones. These results show that optimizing a language model for reasoning can mitigate but might not fully overcome the language model's probability sensitivity.
Continual Reinforcement Learning with TELLA
Fendley, Neil, Costello, Cash, Nguyen, Eric, Perrotta, Gino, Lowman, Corey
In the last decade, reinforcement learning (RL) with deep neural networks has been successfully applied in a wide variety of domains (Arulkumaran et al., 2017). In typical RL scenarios, the agent learns a single task, defined as a single Partially Observable Markov Decision Process (POMDP). Training RL agents that instead continually learn across multiple environments is a challenging problem, made more difficult by a lack of reproducible experiments and standard metrics for comparing continual learning approaches. To address this, we introduce TELLA, a tool for the test and evaluation of lifelong learning agents. Researchers can define and share their own curricula over various learning environments or run against a curriculum created under the DARPA Lifelong Learning Machines (L2M) Program.
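A hypothetical sketch of the curriculum-driven evaluation loop this abstract describes is below. The `Agent` class, `learn`/`evaluate` methods, and curriculum structure are illustrative assumptions, not TELLA's actual API; the key idea shown is that after each learning block the agent is re-evaluated on all tasks seen so far, which is what exposes forgetting.

```python
# Hypothetical continual-learning curriculum loop in the spirit of the
# abstract above. Names like `Agent` are illustrative placeholders,
# NOT TELLA's actual API.
import random

class Agent:
    """Stub lifelong learner: tracks a crude per-task proficiency score."""
    def __init__(self):
        self.skill = {}  # task name -> proficiency estimate

    def learn(self, task: str) -> None:
        self.skill[task] = self.skill.get(task, 0.0) + random.random()

    def evaluate(self, task: str) -> float:
        return self.skill.get(task, 0.0)

# A curriculum is an ordered sequence of learning blocks; after each
# block, evaluate on every task seen so far to measure forgetting.
curriculum = ["task_a", "task_b", "task_a", "task_c"]
agent, seen = Agent(), []
for task in curriculum:
    agent.learn(task)
    seen.append(task)
    for t in dict.fromkeys(seen):  # unique tasks, first-seen order
        print(f"after {task}: eval on {t} = {agent.evaluate(t):.2f}")
```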
Allocation of Multi-Robot Tasks with Task Variants
Task allocation is a well-studied problem. Most prior formulations assume that each task is associated with a unique set of resource requirements; in the multi-robot task allocation problem, these requirements can be satisfied by a coalition of robots. In this paper, we introduce a more general formulation of the multi-robot task allocation problem that allows more than one option for specifying a task's requirements: satisfying any one of the options satisfies the task. We refer to this new problem as the multi-robot task allocation problem with task variants. First, we show theoretically that, fortunately, this extension does not change the complexity class; the problem remains NP-complete. For solution methods, we adapt two previous greedy methods for the task allocation problem without task variants and analyze their effectiveness. In particular, we "flatten" the new problem into the problem without task variants, modify the previous methods to solve the flattened problem, and prove that the bounds still hold. Finally, we thoroughly evaluate these two methods, along with a random baseline, to demonstrate their efficacy on the new problem.
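A hedged sketch of the "flattening" idea described above: each task with several requirement options is expanded into one candidate per (task, variant) pair, and a greedy pass then treats those candidates like ordinary variant-free tasks while keeping at most one variant per task. The data model and greedy ordering rule here are illustrative assumptions, not the paper's algorithm.

```python
# Flatten tasks-with-variants into (task, variant, requirement) candidates,
# then run a simple greedy allocation over the flattened candidates.
from collections import Counter

robots = Counter({"camera": 2, "arm": 1, "lidar": 1})  # available units by capability

# task -> list of variant requirement options (any one option suffices)
tasks = {
    "inspect": [Counter({"camera": 2}), Counter({"lidar": 1})],
    "grasp":   [Counter({"arm": 1, "camera": 1})],
}

def flatten(tasks):
    """Expand variants into (task, variant_index, requirement) candidates."""
    return [(t, i, req) for t, opts in tasks.items() for i, req in enumerate(opts)]

assigned = {}
# Greedy rule (an assumption): try cheaper requirement options first.
for task, i, req in sorted(flatten(tasks), key=lambda c: sum(c[2].values())):
    if task in assigned:                                # at most one variant per task
        continue
    if all(robots[c] >= n for c, n in req.items()):     # coalition available?
        robots -= req                                   # commit those robots
        assigned[task] = i
print(assigned)  # e.g. {'inspect': 1, 'grasp': 0}
```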
Perturbation Training for Human-Robot Teams
Ramakrishnan, Ramya, Zhang, Chongjie, Shah, Julie
In this work, we design and evaluate a computational learning model that enables a human-robot team to co-develop joint strategies for performing novel tasks that require coordination. The joint strategies are learned through "perturbation training," a human team-training strategy in which team members practice variations of a given task to help the team generalize to new variants of that task. We formally define the problem of human-robot perturbation training and develop and evaluate the first end-to-end framework for such training, which incorporates a multi-agent transfer learning algorithm, a human-robot co-learning framework, and a communication protocol. Our transfer learning algorithm, Adaptive Perturbation Training (AdaPT), is a hybrid of transfer and reinforcement learning techniques that learns quickly and robustly on new task variants. We empirically validate the benefits of AdaPT by comparing it to other hybrid reinforcement and transfer learning techniques that transfer knowledge from multiple source tasks to a single target task. We also demonstrate that AdaPT's rapid learning supports live interaction between a person and a robot, during which the human-robot team trains to achieve a high level of performance on new task variants. We augment AdaPT with a co-learning framework and a computational bidirectional communication protocol so that the robot can co-train with a person during live interaction. Results from large-scale human-subject experiments (n=48) indicate that AdaPT enables an agent to learn in a manner compatible with a human's own learning process, and that a robot undergoing perturbation training with a human achieves a high level of team performance. Finally, we demonstrate that human-robot training using AdaPT in a simulation environment produces effective performance for a team that includes an embodied robot partner.
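A minimal sketch of the perturbation-training idea above: practice several perturbed variants of a task, then warm-start learning on a new variant from the most similar practiced one. This is a crude stand-in for AdaPT's source-task selection, not the authors' algorithm; the chain-world task, tabular Q-learning, and the goal-distance similarity measure are all illustrative assumptions.

```python
# Practice perturbed variants (different goal states), then warm-start
# a new variant's Q-table from the closest practiced variant.
import random

N = 10  # states 0..N-1 in a 1-D chain; actions: 0 = left, 1 = right

def q_learn(goal, q=None, episodes=200, alpha=0.5, gamma=0.9, eps=0.1):
    q = q or {(s, a): 0.0 for s in range(N) for a in (0, 1)}
    for _ in range(episodes):
        s = 0
        for _ in range(4 * N):
            a = random.choice((0, 1)) if random.random() < eps \
                else max((0, 1), key=lambda act: q[(s, act)])
            s2 = max(0, min(N - 1, s + (1 if a else -1)))
            r = 1.0 if s2 == goal else 0.0
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
            s = s2
            if s == goal:
                break
    return q

# "Perturbation training": practice variants with different goals.
library = {g: q_learn(g) for g in (3, 6, 9)}

# New variant: warm-start from the most similar practiced variant.
new_goal = 7
source = min(library, key=lambda g: abs(g - new_goal))
q_new = q_learn(new_goal, q=dict(library[source]), episodes=50)
print(f"warm-started goal={new_goal} from practiced goal={source}")
```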